cumulative utility
Strategizing against No-regret Learners
Deng, Yuan, Schneider, Jon, Sivan, Balusubramanian
How should a player who repeatedly plays a game against a no-regret learner strategize to maximize his utility? We study this question and show that under some mild assumptions, the player can always guarantee himself a utility of at least what he would get in a Stackelberg equilibrium of the game. When the no-regret learner has only two actions, we show that the player cannot get any higher utility than the Stackelberg equilibrium utility. But when the no-regret learner has more than two actions and plays a mean-based no-regret strategy, we show that the player can get strictly higher utility than the Stackelberg equilibrium utility. We provide a characterization of the optimal game-play for the player against a mean-based no-regret learner as a solution to a control problem. When the no-regret learner's strategy also guarantees him no-swap regret, we show that the player cannot get anything higher than the Stackelberg equilibrium utility.
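As a rough illustration of the kind of learner discussed above, the following sketch simulates Hedge (multiplicative weights), a standard no-regret algorithm that is commonly cited as an example of a mean-based learner. The abstract does not name a specific algorithm, so the choice of Hedge, the function names, the payoff-matrix layout, and the learning rate `eta` are all assumptions made for illustration.

```python
import math


def hedge_weights(cumulative_utils, eta):
    """Hedge distribution: probability proportional to exp(eta * cumulative utility)."""
    m = max(cumulative_utils)  # subtract the max for numerical stability
    exps = [math.exp(eta * (u - m)) for u in cumulative_utils]
    total = sum(exps)
    return [e / total for e in exps]


def play_against_hedge(learner_payoffs, optimizer_actions, eta=0.1):
    """Simulate a Hedge learner facing a fixed sequence of optimizer actions.

    learner_payoffs[i][j] is the learner's utility for its action j when the
    optimizer plays action i (hypothetical layout, chosen for this sketch).
    Returns the learner's mixed strategy in each round.
    """
    n_actions = len(learner_payoffs[0])
    cumulative = [0.0] * n_actions
    history = []
    for i in optimizer_actions:
        # The learner commits to a mixed strategy based only on past rounds.
        history.append(hedge_weights(cumulative, eta))
        # Full-information update: every action's cumulative utility is revised.
        for j in range(n_actions):
            cumulative[j] += learner_payoffs[i][j]
    return history


if __name__ == "__main__":
    payoffs = [[1.0, 0.0], [0.0, 1.0]]
    strategies = play_against_hedge(payoffs, [0] * 50, eta=0.5)
    print(strategies[0], strategies[-1])
```

Because the learner's strategy depends on the history only through cumulative utilities, an optimizer who controls the action sequence can steer those totals over time, which is the kind of leverage the control-problem characterization refers to.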
NoveltyBench: Evaluating Language Models for Humanlike Diversity
Zhang, Yiming, Diddee, Harshita, Holm, Susan, Liu, Hanchen, Liu, Xinyue, Samuel, Vinay, Wang, Barry, Ippolito, Daphne
Language models have demonstrated remarkable capabilities on standard benchmarks, yet they increasingly suffer from mode collapse, the inability to generate diverse and novel outputs. Our work introduces NoveltyBench, a benchmark specifically designed to evaluate the ability of language models to produce multiple distinct and high-quality outputs. NoveltyBench utilizes prompts curated to elicit diverse answers and filtered real-world user queries. Evaluating 20 leading language models, we find that current state-of-the-art systems generate significantly less diversity than human writers. Notably, larger models within a family often exhibit less diversity than their smaller counterparts, challenging the notion that capability on standard benchmarks translates directly to generative utility. While prompting strategies like in-context regeneration can elicit diversity, our findings highlight a fundamental lack of distributional diversity in current models, reducing their utility for users seeking varied responses and suggesting the need for new training and evaluation paradigms that prioritize diversity alongside quality.
Remember, but also, Forget: Bridging Myopic and Perfect Recall Fairness with Past-Discounting
Dynamic resource allocation in multi-agent settings often requires balancing efficiency with fairness over time--a challenge inadequately addressed by conventional, myopic fairness measures. Motivated by behavioral insights that human judgments of fairness evolve with temporal distance, we introduce a novel framework for temporal fairness that incorporates past-discounting mechanisms. By applying a tunable discount factor to historical utilities, our approach interpolates between instantaneous and perfect-recall fairness, thereby capturing both immediate outcomes and long-term equity considerations. Beyond aligning more closely with human perceptions of fairness, this past-discounting method ensures that the augmented state space remains bounded, significantly improving computational tractability in sequential decision-making settings. We detail the formulation of discounted-recall fairness in both additive and averaged utility contexts, illustrate its benefits through practical examples, and discuss its implications for designing balanced, scalable resource allocation strategies.
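One way to picture the past-discounting idea is as a recursive accumulator over an agent's per-step utilities. The sketch below is an assumption about the additive form only: the recursion `z = gamma * z + u`, the function name, and the parameter name `gamma` are illustrative and are not taken from the paper. Under this assumed form, `gamma = 0` keeps only the current utility (instantaneous fairness), `gamma = 1` keeps the full undiscounted sum (perfect recall), and for `gamma < 1` with utilities bounded by `u_max` the accumulator never exceeds `u_max / (1 - gamma)`.

```python
def discounted_recall(utilities, gamma):
    """Past-discounted running utility (assumed additive form).

    Each step shrinks the accumulated past by `gamma` before adding the
    newest utility, so older outcomes count for progressively less.
    """
    z = 0.0
    trace = []
    for u in utilities:
        z = gamma * z + u
        trace.append(z)
    return trace


if __name__ == "__main__":
    stream = [1, 1, 1]
    print(discounted_recall(stream, 0.0))  # instantaneous: only the current step
    print(discounted_recall(stream, 1.0))  # perfect recall: running sum
    print(discounted_recall(stream, 0.5))  # in between
```

The geometric bound is what keeps the augmented state finite in this sketch: with perfect recall the accumulator can grow without limit as the horizon grows, while any `gamma < 1` caps it.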
AMUSE: Adaptive Model Updating using a Simulated Environment
Chislett, Louis, Vallejos, Catalina A., Cannings, Timothy I., Liley, James
Prediction models frequently face the challenge of concept drift, in which the underlying data distribution changes over time, weakening performance. Examples can include models which predict loan default, or those used in healthcare contexts. Typical management strategies involve regular model updates or updates triggered by concept drift detection. However, these simple policies do not necessarily balance the cost of model updating with improved classifier performance. We present AMUSE (Adaptive Model Updating using a Simulated Environment), a novel method leveraging reinforcement learning trained within a simulated data generating environment, to determine update timings for classifiers. The optimal updating policy depends on the current data generating process and ongoing drift process. Our key idea is that we can train an arbitrarily complex model updating policy by creating a training environment in which possible episodes of drift are simulated by a parametric model, which represents expectations of possible drift patterns. As a result, AMUSE proactively recommends updates based on estimated performance improvements, learning a policy that balances maintaining model performance with minimizing update costs. Empirical results confirm the effectiveness of AMUSE in simulated data.
Dynamics of the Ride-Sourcing Market: A Coevolutionary Model of Competition between Two-Sided Mobility Platforms
Ghasemi, Farnoud, Drabicki, Arkadiusz, Kucharski, Rafał
Competition between two-sided mobility platforms (e.g., Uber and Lyft) is fierce and fueled by massive subsidies, yet the underlying dynamics and interactions between the competing platforms are largely unknown. These platforms rely on cross-side network effects to grow: they need to attract agents from both sides to kick off, since travellers are needed for drivers and drivers are needed for travellers. We use our coevolutionary model, which features S-shaped learning curves, to simulate the day-to-day dynamics of the ride-sourcing market at the microscopic level. We run three scenarios to illustrate the possible equilibria in the market. Our results underline how the correlation inside the ride-sourcing nest of the agents' choice set significantly affects the platforms' market shares. While late entry to the market decreases the chance of platform success and possibly results in a "winner-takes-all" outcome, heavy subsidies can keep the new platform in competition, giving rise to a "market sharing" regime.